Each and every one of us will, at some point late in our lives, contemplate a move to a retirement home or need to help our elderly parents find one.
But which one to choose?
Let’s take a look at Canterbury Gardens, a licensed assisted living and memory care home with capacity for up to 120 residents.
Looking good. Their advertisement video has a professional voiceover, a panning aerial drone view, and shots of clean living areas with game tables. Their blurb assures us that we will find a warm, dedicated staff committed to creating a community you will be delighted to call home.
Would you want to live there?
What if I told you that on 7/28/2014 it was found that the kitchen had been swarming with cockroaches for up to six months, that the staff weren’t using disinfectant to clean the tables, that dead, bloated mice were left in traps for days on end, and that the lunch served the very day the inspector arrived had been made from crates of rotten, moldy potatoes and onions covered in cockroaches? That residents had been reporting being served cold food on sticky, greasy tables? That interviewed staff sobbed while retelling how they were unable to solve the problem and had cockroaches crawling up their arms while doing dishes because the administrator wouldn’t pay the exterminator company? That similar sanitation issues had been cited in a 2013 health survey, and that while corrective actions were taken, this location has remained licensed and there is no indication that the administrator in charge was fired? The same kind of cost-cutting behavior that led to this state of affairs shows up again in a more recent survey conducted in 2016.
It’s all detailed within this report: http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl3.aspx?tg=0706&eid=UTSS11&ft=pcbhpp&id=2304B1&bdg=00&reg=ALR6
Would you still want to live there? I wouldn’t. However, a cursory search for Canterbury Gardens online won’t tell you that any of this ever happened. I only discovered it by looking through the Colorado Department of Public Health and Environment’s database while employed at a company that specializes in elder care.
In this database you can find some of the worst snapshots of human misery imaginable. People dying covered in their own diarrhea after hours of violent illness because a staff member didn’t want to help clean them or think to call for medical assistance. People being sexually and physically abused by other residents and told to shut up or called liars when trying to report it to staff. A person catching fire and dying because a staff member let them smoke while they had an oxygen mask on. It’s grotesque and horrible stuff that people should have a right to know about when they are looking for a place where they or their loved ones can live when they’re too old to take care of themselves, but unfortunately this information sits in a nondescript state database.
In this tutorial I will be scraping and conducting surface-level analyses of the Colorado Department of Public Health and Environment’s database for licensed elderly care homes. I will also be creating a leaflet map which will allow anyone to click on elderly home locations and review the survey data on the location.
This database includes roughly 650 locations and has for each location information including the number of beds licensed for that location, the type of ownership the location has, the address of the location, the phone number of the location, and the number of branches the location has. Of most interest to us, this database has a record of every public health survey that the state has conducted on each individual location. These surveys include incident reports/citations that each have a severity/scope rating as well as the written report of the citations themselves.
This tutorial will be split into three sections:
Section 1: The Big Scrape. In this section I will be scraping the multi-layered database and gathering the results into three tables: the first for locations, the second for surveys, and the third for individual incidents.
Section 2: Tidying and Analysis. In this section I will be tidying up the data results and conducting a linear regression analysis as well as plotting relationships.
Section 3: Location Map with Leaflet. In this section I will be creating a Leaflet map which will display locations that can be clicked to find out their specifications.
To start we will need our toolkit. In this tutorial I will be using a package called ‘webdriver’ and a browser driver called ‘phantomjs’. The following code block loads the packages, installs phantomjs, and initializes our crawler.
library(rvest)
## Loading required package: xml2
library(tidyverse)
## ── Attaching packages ───────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 2.2.1 ✔ purrr 0.2.4
## ✔ tibble 1.4.2 ✔ dplyr 0.7.4
## ✔ tidyr 0.8.0 ✔ stringr 1.2.0
## ✔ readr 1.1.1 ✔ forcats 0.2.0
## ── Conflicts ──────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ readr::guess_encoding() masks rvest::guess_encoding()
## ✖ dplyr::lag() masks stats::lag()
## ✖ purrr::pluck() masks rvest::pluck()
require(webdriver)
## Loading required package: webdriver
install_phantomjs()
## phantomjs has been installed to /Users/jbunker/Library/Application Support/PhantomJS
pjs <- run_phantomjs()
ses <- Session$new(port = pjs$port)
By directing our crawler to the url of the database we can find the following screen:
url <- "http://www.hfemsd2.dphe.state.co.us/hfd2003/homebase.aspx?Ftype=pcbhpp&Do=srch"
ses$go(url)
ses$takeScreenshot()
This is the top level of the database; however, this page is merely a gate behind which we can find the tasty data we are looking for. To pass this gate we will click the “Start Search” button with the following code block.
search <- ses$findElement("#SubmitCriteria")
search$sendKeys("", key$enter)
searchlink <- ses$getUrl()
Now we should be seeing the following:
The full screenshot is too large to show, but it extends several pages down. This page contains a table with links to every location in the database.
We will now scrape each of the elements in this table and get their href attributes. This will allow us to have links and names for every location in the database.
facilityList <- ses$findElement("#Faclist")
locations <- facilityList$findElements("a")
df <- data.frame(Doubles=double(),
Ints=integer(),
Factors=factor(),
Logicals=logical(),
Characters=character(),
stringsAsFactors=FALSE)
hrefs <- c()
names <- c()
z <- 1
while(z <= length(locations)) {
hrefs[[z]] <- locations[[z]]$getAttribute("href")
names[[z]] <- locations[[z]]$getText()
z <- z + 1
}
print(length(hrefs))
## [1] 704
Our scrape picked up 704 locations in the database. Upon investigation one can find that around fifty of these link to the same page as another entry. This is because some locations have unique names for different branches but share the same unique ID within the database, and surveys are conducted for that single entity. Later we will eliminate these duplicates.
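As a preview of how that de-duplication can work, here is a minimal sketch in base R, assuming the location ID is the value of the `id=` parameter in each link. The branch names are made up for illustration; only the URL pattern comes from the database.

```r
# Two hypothetical branch names that point at the same location page,
# plus one distinct location.
links <- data.frame(
  name = c("BRANCH ONE", "BRANCH TWO", "SOMEWHERE ELSE"),
  href = c("http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23G941&ft=pcbhpp",
           "http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23G941&ft=pcbhpp",
           "http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=2304JA&ft=pcbhpp"),
  stringsAsFactors = FALSE
)

# Capture whatever sits between "id=" and the next "&" (or the end of the string).
# Requiring "?" or "&" directly before "id=" avoids matching parameters like "eid=".
links$id <- sub(".*[?&]id=([^&]*).*", "\\1", links$href)

# Keep only the first row seen for each id.
unique_links <- links[!duplicated(links$id), ]
print(unique_links[, c("name", "id")])
```

Working on the whole column at once avoids a row-by-row loop, and `duplicated()` keeps whichever branch name happened to be listed first.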
Next we will be looking under the hood of the links we have gathered. For example, here is the page for the first location in our database:
ses$go(hrefs[1])
ses$takeScreenshot()
We can see here that there are three different sections of interest. The top section includes all of the general information about the location. This location is called ‘A DOCTOR’S TOUCH LLC’. It is administered by a Mr. Herald Ostovar, it has eight licensed beds, which makes it a small location, and its ownership type is ‘limited liability’.
The second section is titled ‘Occurrences’. This includes a list of every time a report was made by the location itself regarding an incident. Picking out a few occurrences at random under this link, I could find reports of a fight between residents, an unsolved petty theft, and an account of a resident with dementia picked up by police while wandering around a Walgreens. Unfortunately these occurrences are not given any rating, and we will be leaving them out of this tutorial.
The third section contains the meat of the database: the health surveys conducted by government employees. These surveys are conducted either to reaffirm the licensability of the location or in response to reports received about mismanagement or mistreatment at the facility. They are broken into two types, ‘Health Surveys’ and ‘Life Safety Surveys’. Both survey types have the same format, so we will be compiling them into the same table, but within the HTML they are listed in different tables.
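The general-information section arrives from the scraper as a single newline-separated string. Here is a minimal sketch of how individual fields could be pulled out of it, assuming each field sits on its own line behind a fixed label. The helper `field_after` is hypothetical, and the text is an abridged copy of what the page returns for this location.

```r
# Abridged text of the general-information block for the first location.
demography <- paste(
  "A DOCTOR'S TOUCH LLC",
  "Assisted Living Residence - Medicaid Certified",
  "Licensed Beds: 8",
  "Ownership type: LIMITED LIABILITY",
  "Current ownership effective: 1/14/2008",
  sep = "\n"
)

# Hypothetical helper: return the rest of the first line that starts with
# `label`, or NA when no line carries that label.
field_after <- function(text, label) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  hit <- lines[startsWith(lines, label)]
  if (length(hit) == 0) return(NA_character_)
  trimws(substring(hit[1], nchar(label) + 1))
}

beds <- as.numeric(field_after(demography, "Licensed Beds:"))
ownership <- field_after(demography, "Ownership type:")
print(beds)
print(ownership)
```

Splitting on newlines first means a missing or extra line elsewhere in the block does not shift the result, which matters because not every location lists the same set of fields.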
In the following code block we will be going through each location link and scraping the dates and href links of each health or life-safety survey at each location.
library(foreach)
##
## Attaching package: 'foreach'
## The following objects are masked from 'package:purrr':
##
## accumulate, when
snames <- c() #health survey location names
lsnames <- c() #life survey location names
accessTimes <- c() #the time at which this page was accessed by our scraper
demographies <- c() #addresses, phone numbers, etc
occurrencesHrefs <- c() #the links to the occurences page
occurrencesTexts <- c()
healthSurveyCount <- c() #the number of health surveys per location
healthSurveysHrefs <- c() #the links to each health survey
healthSurveysDates <- c() #the dates of each health survey
lifeSafetySurveyCount <- c() #the number of life safety surveys per location
lifeSafetySurveyHrefs <- c() #the links to each life safety survey
lifeSafetySurveyDates <- c() #the dates of each life safety survey
z <- 1
while(z <= length(locations)) {
url <- hrefs[[z]]
ses$go(url)
datetime <- ses$findElement("#HeaderDetail_DisplayDateTime")
datetime <- datetime$getText()
accessTimes[[z]] <- datetime
demography <- ses$findElement("#DemogData")
demography <- demography$getText()
demographies[[z]] <- demography
occurrences <- ses$findElement("#OccData")
occurrencesHref <- occurrences$findElements("a")
occurrencesHref <- occurrencesHref[[1]]$getAttribute("href")
occurrencesHrefs[[z]] <- occurrencesHref #index by z so each location keeps its own link instead of overwriting the whole vector
occurrencesText <- occurrences$getText()
occurrencesTexts[[z]] <- occurrencesText
surveysTab <- ses$findElement('#SurvTab')
healthSurveys <- surveysTab$findElement('#SurvHealth')
healthSurveys <- healthSurveys$findElements('a')
foreach(i=healthSurveys) %do% {healthSurveysHrefs <- c(healthSurveysHrefs, i$getAttribute("href"))}
foreach(i=healthSurveys) %do% {healthSurveysDates <- c(healthSurveysDates, i$getText())}
foreach(i=healthSurveys) %do% {snames <- c(snames, names[[z]])}
healthSurveyCount[[z]] <- length(healthSurveys)
SurvLSC <- surveysTab$findElement('#SurvLSC')
lifeSafetySurveys <- SurvLSC$findElements('a')
foreach(i=lifeSafetySurveys) %do% {lifeSafetySurveyHrefs <- c(lifeSafetySurveyHrefs, i$getAttribute("href"))}
foreach(i=lifeSafetySurveys) %do% {lifeSafetySurveyDates <- c(lifeSafetySurveyDates, i$getText())}
foreach(i=lifeSafetySurveys) %do% {lsnames <- c(lsnames, names[[z]])}
lifeSafetySurveyCount[[z]] <- length(lifeSafetySurveys)
z <- z + 1
}
print(sum(lifeSafetySurveyCount) + sum(healthSurveyCount))
## [1] 3126
We have now scraped 3,126 health and life safety surveys.
Now that we have the links to each survey, we will need to scrape the incidents recorded within each survey. A typical survey page looks like this:
ses$go(healthSurveysHrefs[4])
ses$takeScreenshot()
Above you can see the health survey conducted for ‘CANTERBURY GARDENS INDEPENDENT AND ASSISTED LIVING’ performed on 9/16/2015. Each survey has an initial comments section; surveys that find nothing wrong will have an initial comments section and nothing else. There are three citations for this survey. The table contains four columns: one for the summary of the regulation cited, one for the scope of the problem, one for the severity of the problem, and one with a letter grade for the scope/severity of the problem. We will be collecting the values within each of these columns for each row.
The letter rating has a common relationship with the severity:
Potential harm to the resident(s) - A/B
Actual harm to the resident(s) - C/D
Life threatening to the resident(s) - E
Within each pair, the letter depends on scope: a citation for ‘Actual harm to the resident(s)’ that only affects a couple of residents will receive a ‘C’ grade, whereas one that affects many residents will receive a ‘D’ grade.
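To make the relationship concrete, here is a rough sketch of the lookup as a function. This is my own reading of the pattern, not the department’s published rule: `grade_for` is a hypothetical helper, and it assumes that an ‘Isolated’ scope gives the lower letter of each pair while any wider scope gives the higher one.

```r
# Hypothetical lookup from (severity, scope) to a letter grade.
# Assumption: "Isolated" scope maps to the lower letter of each pair,
# any wider scope (for example "Pattern") maps to the higher letter.
grade_for <- function(severity, scope) {
  if (grepl("^Life threatening", severity)) return("E")
  narrow <- scope == "Isolated"
  if (grepl("^Actual harm", severity)) return(if (narrow) "C" else "D")
  if (grepl("^Potential harm", severity)) return(if (narrow) "A" else "B")
  NA_character_  # unrecognized severity text
}

print(grade_for("Potential harm to the resident(s)", "Isolated"))
print(grade_for("Potential harm to the resident(s)", "Pattern"))
print(grade_for("Life threatening to the resident(s)", "Isolated"))
```

Since the database already supplies the letter, a function like this is mainly useful as a sanity check against the scraped grade column.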
In the following code block we will be going through each survey link and collecting the information about the incident reports.
incidentSurveyIDs <- c() #the names of the locations that the survey/incident was conducted at
incidentCount <- c() #number of incidents per survey
reportHrefs <- c() #link to the incident report
reportTitles <- c() #title of the incident report
reportScopes <- c() #scope of the incident
reportSeverities <- c() #severity of the incident
reportLevels <- c() #grade level of the incident
reportDates <- c() #date of the incident report
incidentType <- c() #initial comment, health survey incident, or life safety incident
z <- 1
while(z <= length(healthSurveysHrefs)) {
incidentSurveyID <- snames[[z]]
reportDate <- healthSurveysDates[[z]]
ses$go(healthSurveysHrefs[[z]])
incidentTable <- ses$findElement("#TagList")
incidentTableRows <- incidentTable$findElements('tr')
incidentTableRows[[1]] <- NULL
incidentCount[[z]] <- length(incidentTableRows) - 1
z1 <- 1
while(z1 <= length(incidentTableRows)) {
report <- incidentTableRows[[z1]]$findElement('a')
reportHrefs <- c(reportHrefs, report$getAttribute("href"))
reportTitles <- c(reportTitles, report$getText())
columns <- incidentTableRows[[z1]]$findElements('td')
reportScopes <- c(reportScopes, columns[[2]]$getText())
reportSeverities <- c(reportSeverities, columns[[3]]$getText())
reportLevels <- c(reportLevels, columns[[4]]$getText())
incidentSurveyIDs <- c(incidentSurveyIDs, incidentSurveyID)
reportDates <- c(reportDates, reportDate)
if(z1 == 1) {
incidentType <- c(incidentType ,"Initial Comment")
} else
{
incidentType <- c(incidentType ,"Health Survey Report")
}
z1 <- z1 + 1
}
z <- z + 1
}
lifeSafetyIncidentCount <- c() #number of incidents per life safety survey, kept separate so the health survey counts above are not overwritten
z <- 1
while(z <= length(lifeSafetySurveyHrefs)) {
incidentSurveyID <- lsnames[[z]]
reportDate <- lifeSafetySurveyDates[[z]]
ses$go(lifeSafetySurveyHrefs[[z]])
incidentTable <- ses$findElement("#TagList")
incidentTableRows <- incidentTable$findElements('tr')
incidentTableRows[[1]] <- NULL
lifeSafetyIncidentCount[[z]] <- length(incidentTableRows) - 1
z1 <- 1
while(z1 <= length(incidentTableRows)) {
report <- incidentTableRows[[z1]]$findElement('a')
reportHrefs <- c(reportHrefs, report$getAttribute("href"))
reportTitles <- c(reportTitles, report$getText())
columns <- incidentTableRows[[z1]]$findElements('td')
reportScopes <- c(reportScopes, columns[[2]]$getText())
reportSeverities <- c(reportSeverities, columns[[3]]$getText())
reportLevels <- c(reportLevels, columns[[4]]$getText())
incidentSurveyIDs <- c(incidentSurveyIDs, incidentSurveyID)
reportDates <- c(reportDates, reportDate)
if(z1 == 1) {
incidentType <- c(incidentType ,"Initial Comment")
} else
{
incidentType <- c(incidentType ,"Life Safety Report")
}
z1 <- z1 + 1
}
z <- z + 1
}
#create the data frames out of the disparate lists.
locationsdf <- data.frame(names <- names,
tophrefs <- hrefs,
demographies <- demographies,
occurencesHrefs <- occurrencesHrefs,
occurencesTexts <- occurrencesTexts,
healthSurveyCount <- healthSurveyCount,
lifeSafetySurveyCount <- lifeSafetySurveyCount,
accessTimes <- accessTimes,
stringsAsFactors = FALSE)
healthsurveydf <- data.frame(location <- snames,
healthSurveysDates <- healthSurveysDates,
healthSurveysHrefs <- healthSurveysHrefs,
incidentCount <- incidentCount,
stringsAsFactors = FALSE)
incidentdf <- data.frame(location <- incidentSurveyIDs,
incidentType <- incidentType,
reportTitles <- reportTitles,
reportScope <- reportScopes,
reportSeverity <- reportSeverities,
reportLevels <- reportLevels,
reportDates <- reportDates,
reportHrefs <- reportHrefs,
stringsAsFactors = FALSE)
At this point we have now scraped all of the information that we will be gathering from this database for this tutorial. All of the incidents have been recorded and stored within our new data frames.
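One quirk is worth knowing before reading the column names in the next section: passing `name <- value` to data.frame() is not the same as passing `name = value`. A small base R illustration with a throwaway vector `x`:

```r
x <- c(1, 2, 3)

# `<-` inside the call is an ordinary assignment, so the argument is unnamed.
# data.frame() then builds the column name from the text of the expression.
arrow_df <- data.frame(val <- x)

# `=` inside the call names the argument, which becomes the column name.
equals_df <- data.frame(val = x)

print(names(arrow_df))   # "val....x"
print(names(equals_df))  # "val"
```

This is why the data frames built above end up with column names such as names....names and tophrefs....hrefs, and why the tidying code refers to columns by those names.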
In the following code block we will be tidying up the data through a series of steps which will make it easier to analyze.
incidents <- filter(incidentdf, reportLevels != '') #filter out initial reports
locations <- locationsdf
surveys <- healthsurveydf
z <- 1
while(z <= nrow(locations)) {
locations[z,'id'] <- (strsplit((strsplit(locations[z,'tophrefs....hrefs'], "id="))[[1]][2], "&ft="))[[1]][1]
z <- z + 1 #extract unique location ids from href link
}
locations <- locations[!duplicated(locations$id), ]
#get rid of the duplicate locations, ex: locations with multiple branches under different names
incidents <- incidents[!duplicated(incidents$reportHrefs....reportHrefs), ]
#get rid of the duplicate incidents, ex: incidents reported in multiple branches of same location
z <- 1
while(z <= nrow(incidents)) {
incidents[z,'id'] <- (strsplit((strsplit(incidents[z,'reportHrefs....reportHrefs'], "pcbhpp&id="))[[1]][2], "&bdg="))[[1]][1]
z <- z + 1
} #extract location id for each incident to create matching identifier
z <- 1
while(z <= nrow(locations)) {
temp <- (strsplit((strsplit(locations[z,'demographies....demographies'], "Ownership type: "))[[1]][2], "\nOmbudsman Phone:"))[[1]][1]
temp <- (strsplit(temp, "\nCurrent ownership effective"))[[1]][1]
locations[z,'ownership'] <- temp
z <- z + 1
} #extract location ownership type
z <- 1
while(z <= nrow(locations)) {
temp <- (strsplit((strsplit(locations[z,'demographies....demographies'], "Licensed Beds: "))[[1]][2], "\nOwnership type: "))[[1]][1]
temp <- (strsplit(temp, "\nSecured Beds: "))[[1]][1]
locations[z,'beds'] <- as.numeric(temp)
z <- z + 1
} #extract licensed bed count
incidents <- incidents %>%
type_convert(col_types = cols(
reportDates....reportDates = col_datetime(format = "%m/%d/%Y")
),
na = c("", "NA"),
locale = default_locale(),
trim_ws = TRUE
) #transform into datetime format
tm1 <- as.POSIXct("2014-05-18") #four years before the day this project was uploaded.
z <- 1
while(z <= nrow(locations)) {
target <- locations[z, "id"]
targetincidents <- filter(incidents, id==target, reportDates....reportDates > tm1)
#filter out reports which occurred more than four years ago to get a more accurate look at the current state of the location
Acount <- nrow(filter(targetincidents, reportLevels....reportLevels=="A"))
Bcount <- nrow(filter(targetincidents, reportLevels....reportLevels=="B"))
Ccount <- nrow(filter(targetincidents, reportLevels....reportLevels=="C"))
Dcount <- nrow(filter(targetincidents, reportLevels....reportLevels=="D"))
Ecount <- nrow(filter(targetincidents, reportLevels....reportLevels=="E"))
locations[z, 'A'] <- as.numeric(Acount)
locations[z, 'B'] <- as.numeric(Bcount)
locations[z, 'C'] <- as.numeric(Ccount)
locations[z, 'D'] <- as.numeric(Dcount)
locations[z, 'E'] <- as.numeric(Ecount)
z <- z + 1
}
head(locations)
## names....names
## 1 A DOCTOR'S TOUCH LLC
## 2 A FEATHERED NEST AT THORNTON
## 3 A LEGACY PERSONAL CARE HOME
## 4 A LOVING HAND ASSISTED LIVING INC
## 5 A ROBIN'S NEST
## 6 A WILDFLOWER ASSISTED LIVING AND CARE HOME INC
## tophrefs....hrefs
## 1 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23G941&ft=pcbhpp
## 2 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=2304JA&ft=pcbhpp
## 3 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23R668&ft=pcbhpp
## 4 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=2304ZS&ft=pcbhpp
## 5 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23052H&ft=pcbhpp
## 6 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23J158&ft=pcbhpp
## demographies....demographies
## 1 A DOCTOR'S TOUCH LLC\n1550 HIAWATHA DRIVE\nCOLORADO SPRINGS, CO 80915 - EL PASO COUNTY\nTelephone: (719)638-8198, Fax: (719)694-3969\n\nAdministrator/Contact: Mr HERALD OSTOVAR\nAssisted Living Residence - Medicaid Certified\nLicensed Beds: 8\nOwnership type: LIMITED LIABILITY\nCurrent ownership effective: 1/14/2008\nOmbudsman Phone: (719)471-7080
## 2 A FEATHERED NEST AT THORNTON\n11540 MILWAUKEE ST\nTHORNTON, CO 80233 - ADAMS COUNTY\nTelephone: (303)453-0810, Fax: (303)453-0810\n\nAdministrator/Contact: MS ANNETTE FARRELL\nAssisted Living Residence - Medicaid Certified\nLicensed Beds: 7\nOwnership type: PROFIT-CORPORATION\nCurrent ownership effective: 10/2/1998\nOmbudsman Phone: (303)480-5624
## 3 A LEGACY PERSONAL CARE HOME\n4050 SOUTH FOX STREET\nENGLEWOOD, CO 80110 - ARAPAHOE COUNTY\nTelephone: (303)783-4989, Fax: (303)635-6719\n\nAdministrator/Contact: Mr SCOTT BOYLE\nAssisted Living Residence - Private Pay\nLicensed Beds: 8\nOwnership type: PROFIT-CORPORATION\nCurrent ownership effective: 7/21/2015\nOmbudsman Phone: (303)480-5624
## 4 A LOVING HAND ASSISTED LIVING INC\n3079 S HOLLY PLACE\nDENVER, CO 80222 - ARAPAHOE COUNTY\nTelephone: (303)782-6951, Fax: (303)504-6038\n\nAdministrator/Contact: Ms JANNELLE MOLINA\nAssisted Living Residence - Medicaid Certified\nLicensed Beds: 8\nOwnership type: PROFIT-CORPORATION\nCurrent ownership effective: 5/8/2007\nOmbudsman Phone: (303)480-5624
## 5 A ROBIN'S NEST\n3182 E OAK CREEK DR\nCOLORADO SPRINGS, CO 80906 - EL PASO COUNTY\nTelephone: (719)226-2941, Fax: (719)226-2942\n\nAdministrator/Contact: MS MARY LOUISE KOURI\nAssisted Living Residence - Private Pay\nLicensed Beds: 8\nOwnership type: PROFIT-CORPORATION\nCurrent ownership effective: 2/23/2000\nOmbudsman Phone: (719)471-7080
## 6 A WILDFLOWER ASSISTED LIVING AND CARE HOME INC\n8417 W 74TH PLACE\nARVADA, CO 80005 - JEFFERSON COUNTY\nTelephone: (303)456-7890, Fax: (866)941-5820\n\nAdministrator/Contact: Ms NICOLE SCHIAVONE\nAssisted Living Residence - Medicaid Certified\nLicensed Beds: 8\nOwnership type: PROFIT-CORPORATION\nCurrent ownership effective: 2/8/2018\nOmbudsman Phone: (303)480-5624\n\nMail c/o: A WILDFLOWER ASSISTED LIVING\n1140 US HWY 281, STE 400-287\nBROOMFIELD,CO 80020
## occurencesHrefs....occurrencesHrefs
## 1 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## 2 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## 3 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## 4 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## 5 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## 6 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## occurencesTexts....occurrencesTexts
## 1 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A DOCTOR'S TOUCH LLC\n\nAbout the Occurrence Reporting System\nNote to Consumers
## 2 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A FEATHERED NEST AT THORNTON\n\nAbout the Occurrence Reporting System\nNote to Consumers
## 3 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A LEGACY PERSONAL CARE HOME\n\nAbout the Occurrence Reporting System\nNote to Consumers
## 4 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A LOVING HAND ASSISTED LIVING INC\n\nAbout the Occurrence Reporting System\nNote to Consumers
## 5 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A ROBIN'S NEST\n\nAbout the Occurrence Reporting System\nNote to Consumers
## 6 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A WILDFLOWER ASSISTED LIVING AND CARE HOME INC\n\nAbout the Occurrence Reporting System\nNote to Consumers
## healthSurveyCount....healthSurveyCount
## 1 5
## 2 3
## 3 3
## 4 3
## 5 4
## 6 2
## lifeSafetySurveyCount....lifeSafetySurveyCount
## 1 1
## 2 1
## 3 1
## 4 1
## 5 1
## 6 0
## accessTimes....accessTimes id ownership beds A B C D E
## 1 Friday, May 18, 2018 1:48 AM 23G941 LIMITED LIABILITY 8 5 12 0 0 0
## 2 Friday, May 18, 2018 1:48 AM 2304JA PROFIT-CORPORATION 7 2 4 0 0 0
## 3 Friday, May 18, 2018 1:48 AM 23R668 PROFIT-CORPORATION 8 0 5 0 0 0
## 4 Friday, May 18, 2018 1:48 AM 2304ZS PROFIT-CORPORATION 8 8 2 0 0 0
## 5 Friday, May 18, 2018 1:48 AM 23052H PROFIT-CORPORATION 8 0 1 0 0 0
## 6 Friday, May 18, 2018 1:48 AM 23J158 PROFIT-CORPORATION 8 0 4 0 0 0
head(incidents)
## location....incidentSurveyIDs incidentType....incidentType
## 1 A DOCTOR'S TOUCH LLC Health Survey Report
## 2 A DOCTOR'S TOUCH LLC Health Survey Report
## 3 A DOCTOR'S TOUCH LLC Health Survey Report
## 4 A DOCTOR'S TOUCH LLC Health Survey Report
## 5 A DOCTOR'S TOUCH LLC Health Survey Report
## 6 A DOCTOR'S TOUCH LLC Health Survey Report
## reportTitles....reportTitles
## 1 0004-ALR Compliance with Chapter VII ALR Regs
## 2 0006-ALR Compliance w/ Chapter II Gen Lic Regs
## 3 0034-ALR Compliance w/ Occurrence Report Req
## 4 0410-P&P consistent with regimen taught in class
## 5 0506-ALR Activities - Opport W/in and Outside Fac.
## 6 0556-ALR Meds - Admin Only Upon Current Orders
## reportScope....reportScopes reportSeverity....reportSeverities
## 1 Pattern Potential harm to the resident(s)
## 2 Pattern Potential harm to the resident(s)
## 3 Isolated Potential harm to the resident(s)
## 4 Pattern Potential harm to the resident(s)
## 5 Pattern Potential harm to the resident(s)
## 6 Isolated Potential harm to the resident(s)
## reportLevels....reportLevels reportDates....reportDates
## 1 B 2016-10-01
## 2 B 2016-10-01
## 3 A 2016-10-01
## 4 B 2016-10-01
## 5 B 2016-10-01
## 6 A 2016-10-01
## reportHrefs....reportHrefs
## 1 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl3.aspx?tg=0004&eid=AEWV11&ft=pcbhpp&id=23G941&bdg=00&reg=ALR7
## 2 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl3.aspx?tg=0006&eid=AEWV11&ft=pcbhpp&id=23G941&bdg=00&reg=ALR7
## 3 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl3.aspx?tg=0034&eid=AEWV11&ft=pcbhpp&id=23G941&bdg=00&reg=ALR7
## 4 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl3.aspx?tg=0410&eid=AEWV11&ft=pcbhpp&id=23G941&bdg=00&reg=MEDA
## 5 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl3.aspx?tg=0506&eid=AEWV11&ft=pcbhpp&id=23G941&bdg=00&reg=ALR7
## 6 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl3.aspx?tg=0556&eid=AEWV11&ft=pcbhpp&id=23G941&bdg=00&reg=ALR7
## id
## 1 23G941
## 2 23G941
## 3 23G941
## 4 23G941
## 5 23G941
## 6 23G941
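The per-location grade counts can also be computed without a loop. Here is a minimal sketch in base R using a made-up incident table (the ids, grades, and dates are invented for illustration) and the same cutoff date as above.

```r
# Toy incident table; every value here is made up.
toy <- data.frame(
  id = c("X1", "X1", "X1", "X2", "X2"),
  level = c("A", "B", "B", "A", "E"),
  date = as.Date(c("10/01/2016", "10/01/2016", "03/15/2012",
                   "06/30/2017", "01/02/2015"), format = "%m/%d/%Y"),
  stringsAsFactors = FALSE
)

# Drop citations from before the cutoff.
cutoff <- as.Date("2014-05-18")
recent <- toy[toy$date > cutoff, ]

# Fixing the factor levels keeps a column for grades with zero citations.
recent$level <- factor(recent$level, levels = c("A", "B", "C", "D", "E"))

# One row per location id, one column per grade.
counts <- table(recent$id, recent$level)
print(counts)
```

The result can be turned into ordinary columns with as.data.frame.matrix(counts) and joined back to the locations table by id. Note that a location with no recent citations at all would not appear in the table, so it would need its zeros filled in after the join.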
Now let’s figure out how many reports there are at each grade level. This will help us judge whether it’s reasonable to look for significant relationships between certain properties and, for example, grade E severity reports.
ratings <- c(sum(locations$A), sum(locations$B), sum(locations$C), sum(locations$D), sum(locations$E))
ratings <- as.data.frame(ratings)
ratings$letter <- c('A', 'B', 'C', 'D', 'E')
#ratings[order(ratings$ratings),c(1,2)]
ratings
## ratings letter
## 1 1579 A
## 2 2411 B
## 3 80 C
## 4 3 D
## 5 37 E
#df_uniq <- unique(select(locations, ownership))
#df_uniq
ownershipcount <- c()
ownershipcount <- c(ownershipcount, nrow(filter(locations, ownership == 'LIMITED LIABILITY')))
ownershipcount <- c(ownershipcount, nrow(filter(locations, ownership == 'PROFIT-CORPORATION')))
ownershipcount <- c(ownershipcount, nrow(filter(locations, ownership == 'LIMITED PARTNERSHIP')))
ownershipcount <- c(ownershipcount, nrow(filter(locations, ownership == 'INDIVIDUAL')))
ownershipcount <- c(ownershipcount, nrow(filter(locations, ownership == 'PARTNERSHIP')))
ownershipcount <- c(ownershipcount, nrow(filter(locations, ownership == 'CORPORATE NON-PROFIT')))
ownershipcount <- c(ownershipcount, nrow(filter(locations, ownership == 'DISTRICT')))
ownershipcount <- c(ownershipcount, nrow(filter(locations, ownership == 'CITY-COUNTY')))
ownershipcount <- c(ownershipcount, nrow(filter(locations, ownership == 'NA')))
counts <- as.data.frame(ownershipcount)
counts$names <- c('LIMITED LIABILITY', 'PROFIT-CORPORATION', 'LIMITED PARTNERSHIP', 'INDIVIDUAL', 'PARTNERSHIP', 'CORPORATE NON-PROFIT','DISTRICT', 'CITY-COUNTY', 'NA')
counts[order(counts$ownershipcount),c(1,2)]
## ownershipcount names
## 9 0 NA
## 7 3 DISTRICT
## 8 3 CITY-COUNTY
## 3 6 LIMITED PARTNERSHIP
## 4 8 INDIVIDUAL
## 5 13 PARTNERSHIP
## 6 111 CORPORATE NON-PROFIT
## 2 198 PROFIT-CORPORATION
## 1 312 LIMITED LIABILITY
It looks like there are only a small number of grade D and E reports (3 and 37, respectively). We will need to take this into account when analyzing the frequency of reports at these grade levels. Furthermore, the vast majority of locations have ‘CORPORATE NON-PROFIT’, ‘PROFIT-CORPORATION’, or ‘LIMITED LIABILITY’ ownership.
In order to get a better analysis of the data, I think it would be beneficial to group all of the other types of ownership into a single category of “OTHER” rather than dropping them from the analysis altogether.
locations$ownership[locations$ownership == "PARTNERSHIP" |
locations$ownership == "INDIVIDUAL" |
locations$ownership == "LIMITED PARTNERSHIP" |
locations$ownership == "CITY-COUNTY" |
locations$ownership == "DISTRICT"] <- "OTHER"
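The same regrouping can be written more compactly with `%in%`. A minimal sketch on a toy vector (the values mirror category names from the database, but the vector itself is made up):

```r
# Toy ownership vector, including a missing value.
ownership <- c("LIMITED LIABILITY", "DISTRICT", "INDIVIDUAL",
               "PROFIT-CORPORATION", "CITY-COUNTY", NA)

# Categories to fold into "OTHER".
rare <- c("PARTNERSHIP", "INDIVIDUAL", "LIMITED PARTNERSHIP",
          "CITY-COUNTY", "DISTRICT")

# %in% returns FALSE rather than NA for missing values, so NA entries are left alone.
ownership[ownership %in% rare] <- "OTHER"

print(table(ownership, useNA = "ifany"))
```

Keeping the list of rare categories in one vector also makes it easy to change the grouping later without rewriting a chain of comparisons.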
Now we will conduct a series of analyses to see if there is any relationship between the number of beds at a location, its ownership, and the number and severity grade of the citation reports it has received in the past four years. For this I will be employing a linear regression model.
require(ggplot2)
library(dplyr)
library(rvest)
library(tidyverse)
library(scales)
##
## Attaching package: 'scales'
## The following object is masked from 'package:purrr':
##
## discard
## The following object is masked from 'package:readr':
##
## col_factor
library(gapminder)
require(broom)
## Loading required package: broom
lm_A <- lm(A~beds+factor(ownership), data=locations)
tidy(lm_A)
## term estimate std.error statistic
## 1 (Intercept) 0.86294540 0.329252468 2.620923
## 2 beds 0.01626647 0.003241582 5.018066
## 3 factor(ownership)LIMITED LIABILITY 1.08392751 0.356709405 3.038685
## 4 factor(ownership)OTHER 0.87564060 0.640129686 1.367911
## 5 factor(ownership)PROFIT-CORPORATION 1.34493087 0.383737207 3.504823
## p.value
## 1 8.974565e-03
## 2 6.747982e-07
## 3 2.471619e-03
## 4 1.718133e-01
## 5 4.883526e-04
lm_B <- lm(B~beds+factor(ownership), data=locations)
tidy(lm_B)
## term estimate std.error statistic
## 1 (Intercept) 1.938753285 0.569585100 3.4037992
## 2 beds 0.009147722 0.005607723 1.6312719
## 3 factor(ownership)LIMITED LIABILITY 1.859967240 0.617083793 3.0141243
## 4 factor(ownership)OTHER 0.766565483 1.107382226 0.6922321
## 5 factor(ownership)PROFIT-CORPORATION 1.593327342 0.663840111 2.4001673
## p.value
## 1 0.0007054555
## 2 0.1033180925
## 3 0.0026777212
## 4 0.4890391942
## 5 0.0166682532
lm_C <- lm(C~beds+factor(ownership), data=locations)
tidy(lm_C)
## term estimate std.error statistic
## 1 (Intercept) 0.058882631 0.0474459604 1.2410463
## 2 beds 0.001322369 0.0004671187 2.8309056
## 3 factor(ownership)LIMITED LIABILITY 0.014819552 0.0514025616 0.2883038
## 4 factor(ownership)OTHER -0.023713389 0.0922440093 -0.2570724
## 5 factor(ownership)PROFIT-CORPORATION 0.039773458 0.0552973237 0.7192655
## p.value
## 1 0.215037059
## 2 0.004785353
## 3 0.773206280
## 4 0.797204358
## 5 0.472236243
#filter(recentIncidents, reportLevels....reportLevels == 'C')
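These results show a positive but small relationship between the number of beds a location has and the number of reports it receives. Each additional licensed bed is associated with roughly 0.016 more grade A citations and 0.001 more grade C citations (both statistically significant, p < 0.01), while the estimate for grade B (about 0.009) is not significant (p ≈ 0.10). This does not support a suspicion I had previously that smaller locations tend to have more problems.
Of more consequence is the relationship between the type of ownership of a location and the frequency of A and B reports. Holding bed count constant, LIMITED LIABILITY and PROFIT-CORPORATION locations average roughly 1.1 to 1.9 more A and B citations each than the baseline category, CORPORATE NON-PROFIT, and these differences are statistically significant (p < 0.02). None of the ownership coefficients for grade C are significant. Let us explore this further in the following code block.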
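When reading the regression tables above, it helps to remember that lm() treats one level of a factor as the baseline and reports every other level as a difference from it. Here is a small sketch with synthetic data, generated only to show this behavior (the numbers mean nothing), including how relevel() can pick a different baseline.

```r
# Synthetic data: bed counts and ownership are balanced by construction,
# and the citation counts are random draws.
set.seed(1)
toy <- data.frame(
  beds = rep(c(8, 20, 60, 120), times = 6),
  ownership = rep(c("CORPORATE NON-PROFIT", "LIMITED LIABILITY",
                    "PROFIT-CORPORATION"), each = 8),
  stringsAsFactors = FALSE
)
toy$A <- rpois(nrow(toy), lambda = 2)

# By default the alphabetically first level becomes the baseline, so it gets
# no coefficient of its own; it is absorbed into the intercept.
fit <- lm(A ~ beds + factor(ownership), data = toy)
print(names(coef(fit)))

# relevel() chooses a different baseline if that reads more naturally.
toy$ownership <- relevel(factor(toy$ownership), ref = "PROFIT-CORPORATION")
fit2 <- lm(A ~ beds + ownership, data = toy)
print(names(coef(fit2)))
```

This is why CORPORATE NON-PROFIT has no row of its own in the tidy() output: the LIMITED LIABILITY, OTHER, and PROFIT-CORPORATION estimates are each measured against it.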
locations %>%
filter(ownership != "NA") %>%
group_by(ownership) %>%
summarize(mean_A=mean(A)) %>%
ggplot(mapping=aes(x=ownership, y=mean_A)) +
geom_bar(stat="identity") +
labs(title = "'A' grades and ownership type")
locations %>%
filter(ownership != "NA") %>%
group_by(ownership) %>%
summarize(meanB=mean(B)) %>%
ggplot(mapping=aes(x=ownership, y=meanB)) +
geom_bar(stat="identity") +
labs(title = "'B' grades and ownership type")
locations %>%
filter(ownership != "NA") %>%
group_by(ownership) %>%
summarize(meanC=mean(C)) %>%
ggplot(mapping=aes(x=ownership, y=meanC)) +
geom_bar(stat="identity") +
labs(title = "'C' grades and ownership type")
Here we can see that PROFIT-CORPORATION and LIMITED LIABILITY locations have a much higher average number of reports for grades A and B, and that NON-PROFIT locations have a lower average number of reports for grades A, B, and C. Interestingly, 'OTHER' type ownership locations have a high average number of 'A' grade reports, but the lowest average number of 'C' grade reports.
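The coefficient tables above compare each ownership level against the baseline level one at a time. A complementary check, sketched below, is an F-test on nested models that asks whether ownership as a whole improves the fit beyond beds alone. The `toy` data frame and its simulated counts are assumptions made purely for illustration; they are not the scraped `locations` data.

```r
# Hypothetical stand-in for `locations`: simulated values, not real data.
set.seed(42)
n <- 200
toy <- data.frame(
  beds = sample(5:120, n, replace = TRUE),
  ownership = sample(
    c("NON-PROFIT", "LIMITED LIABILITY", "OTHER", "PROFIT-CORPORATION"),
    n, replace = TRUE
  ),
  stringsAsFactors = FALSE
)

# Simulate B citation counts with a higher mean for two ownership types.
bump <- ifelse(toy$ownership %in% c("LIMITED LIABILITY", "PROFIT-CORPORATION"), 2, 0)
toy$B <- rpois(n, lambda = 2 + bump)

# Reduced model (beds only) versus full model (beds plus ownership).
reduced <- lm(B ~ beds, data = toy)
full <- lm(B ~ beds + factor(ownership), data = toy)

# anova() on two nested lm fits runs an F-test for the added ownership terms.
ftest <- anova(reduced, full)
print(ftest)
```

With the real data, the same two `lm()` calls could be run on `locations` for each grade column in turn.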
locations %>%
filter(ownership != "NA") %>%
ggplot(aes(x=factor(ownership), y=A)) +
geom_violin() +
labs(title="A rating citations 2014-present over ownership type",
x = "ownership type",
y = "number of A citations")
locations %>%
filter(ownership != "NA") %>%
ggplot(aes(x=factor(ownership), y=B)) +
geom_violin() +
labs(title="B rating citations 2014-present over ownership type",
x = "ownership type",
y = "number of B citations")
locations %>%
filter(ownership != "NA") %>%
ggplot(aes(x=factor(ownership), y=C)) +
geom_violin() +
labs(title="C rating citations 2014-present over ownership type",
x = "ownership type",
y = "number of C citations")
locations %>%
filter(ownership != "NA") %>%
ggplot(aes(x=factor(ownership), y=E)) +
geom_violin() +
labs(title="E rating citations 2014-present over ownership type",
x = "ownership type",
y = "number of E citations")
z <- 1
while(z <= nrow(locations)) { #count all the grades; with `<` the last row was skipped and its allgrades stayed NA
count <- 0
count <- count + locations[z,'A']
count <- count + locations[z,'B']
count <- count + locations[z,'C']
count <- count + locations[z,'D']
count <- count + locations[z,'E']
locations[z,'allgrades'] <- count
z <- z + 1
}
locations %>%
filter(ownership != "NA") %>%
ggplot(aes(x=factor(ownership), y=allgrades)) +
geom_violin() +
labs(title="all rating citations 2014-present over ownership type",
x = "ownership type",
y = "number of citations")
## Warning: Removed 1 rows containing non-finite values (stat_ydensity).
Citation grades: A/B - potential harm to the resident(s); C/D - actual harm to the resident(s); E - life threatening to the resident(s).
With these violin plots we can see that the majority of locations have few or no problems, and that a small number of locations account for a very high concentration of citation reports. This is good news for the purposes of avoiding these 'problem' locations when looking for a retirement home.
However, perhaps we are not concerned with A and B citations in and of themselves. After all, these only relate to 'potential harm to the resident(s)', right? This brings up an important question: how does the frequency of A and B citations relate to the probability of having an E-rating citation?
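To put a number on that concentration, one option is to total the grades per location and ask what share of all citations belongs to the most-cited locations. The sketch below does this on a made-up `toy` data frame (hypothetical counts, not the scraped data) and uses `rowSums()` in place of a row-by-row loop. The 20% cut-off is an arbitrary choice for illustration.

```r
# Hypothetical toy data standing in for `locations` (made-up counts).
toy <- data.frame(
  A = c(0, 1, 0, 8, 0, 2, 0, 0, 15, 1),
  B = c(1, 0, 2, 12, 0, 4, 1, 0, 20, 0),
  C = c(0, 0, 0, 1, 0, 0, 0, 0, 3, 0),
  D = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0),
  E = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0)
)

# Total citations per location, computed for every row at once.
grade_cols <- c("A", "B", "C", "D", "E")
toy$allgrades <- rowSums(toy[, grade_cols])

# Share of all citations held by the top 20% most-cited locations.
sorted <- sort(toy$allgrades, decreasing = TRUE)
top_n <- ceiling(0.2 * length(sorted))
top_share <- sum(sorted[seq_len(top_n)]) / sum(sorted)
print(top_share)
```

In this toy example the two most-cited locations hold 61 of the 73 citations; the real figure would have to be computed from `locations`.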
lm_all <- lm(E~A+B+D+C, data=locations)
tidy(lm_all)
## term estimate std.error statistic p.value
## 1 (Intercept) -0.013965327 0.013739621 -1.016427 0.3098035185
## 2 A 0.012164609 0.004427949 2.747234 0.0061763348
## 3 B 0.009030737 0.002675418 3.375449 0.0007807516
## 4 D 0.310007462 0.161811333 1.915858 0.0558209336
## 5 C 0.053127595 0.024758711 2.145814 0.0322569108
The above results suggest a statistically significant positive relationship between the number of A, B, and C citations and the number of E grade citations (p < 0.05 for each). The D coefficient is by far the largest (about 0.31) but narrowly misses significance at the 0.05 level (p of about 0.056). Intuitively this makes sense: a location that has been cited for potential or actual harm to residents is probably more likely to have caused life threatening harm as well.
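The linear model above treats the count of E citations as the response. Since the question is really about the chance of having any E citation at all, a logistic regression on a yes/no outcome is one alternative worth sketching. This is a swapped-in technique, not the analysis used in this tutorial, and the `toy` data and its coefficients below are simulated assumptions rather than results.

```r
# Simulated stand-in for `locations`; all values and effects are made up.
set.seed(1)
n <- 500
toy <- data.frame(
  A = rpois(n, 2),
  B = rpois(n, 4),
  C = rpois(n, 0.3),
  D = rpois(n, 0.05)
)

# Simulated probability of having at least one E citation.
p <- plogis(-4 + 0.1 * toy$A + 0.1 * toy$B + 0.8 * toy$C + 1.0 * toy$D)
toy$hasE <- rbinom(n, size = 1, prob = p)

# Logistic regression: log-odds of any E citation as a function of A to D counts.
fit <- glm(hasE ~ A + B + C + D, family = binomial, data = toy)
print(summary(fit)$coefficients)

# Exponentiated coefficients read as odds ratios per additional citation.
print(exp(coef(fit)))
```

On the real data the response would be an indicator such as `as.integer(locations$E > 0)`.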
z <- 1
while(z <= nrow(locations)) {
if(locations[z,"E"] > 0) {
locations[z,"hasE"] <- "yes"
} else {
locations[z,"hasE"] <- "no"
}
z <- z + 1
} #add categorical variable for presence of E grade ratings
locations %>%
ggplot(aes(x=hasE, y=allgrades)) +
geom_violin() +
labs(title="all citations 2014-present for locations with and without E ratings",
y = "number of citations",
x = "E grades present")
## Warning: Removed 1 rows containing non-finite values (stat_ydensity).
The above plot suggests that locations with at least one grade E rating tend to have more citations overall than those without, although a fair number of them still fall within the normal range.
Overall the analysis suggests that NON-PROFIT locations receive the fewest citations, whereas PROFIT-CORPORATION and LIMITED LIABILITY locations receive the most A and B citations, and that the counts of A, B, and C citations are each positively associated with E level citations. The violin plots show a small group of locations with very high numbers of citations, indicating that there are in fact 'problem' locations that can be avoided in order to improve the chances of finding a good elderly care home.
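A quick numeric companion to the violin plot is to compare the median number of citations in the two groups. The sketch below uses a made-up `toy` data frame (hypothetical values) and builds the `hasE` flag with the vectorized `ifelse()` rather than a loop.

```r
# Hypothetical toy data: E citation counts and total citations per location.
toy <- data.frame(
  E = c(0, 0, 1, 0, 2, 0),
  allgrades = c(3, 0, 25, 7, 40, 1)
)

# Flag every row at once instead of looping over rows.
toy$hasE <- ifelse(toy$E > 0, "yes", "no")

# Median total citations for locations without and with E ratings.
medians <- tapply(toy$allgrades, toy$hasE, median)
print(medians)
```

The median is used here instead of the mean because a few heavily cited locations would otherwise dominate the comparison.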
In this section we will go about creating a leaflet map that will allow us to look through locations and their information.
To begin we will need to do a little more scraping in order to find the latitude and longitude coordinates of these locations. Luckily, Google's geocoding service returns the latitude and longitude for an address. We will send the address we scraped for each location to it and parse the JSON response. For whatever reason, perhaps rate limiting or some other variability on Google's side, a single query only yields a result roughly 65% of the time. By querying up to four times for each location we can greatly reduce the number of locations with no latitude/longitude data.
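The retry idea can be sketched independently of any particular geocoder. In the block below, `with_retries` and `flaky_geocoder` are hypothetical names introduced for illustration; the fake geocoder simply fails on its first two calls so the example runs without network access. The last line shows why four tries help, under the assumption that each query succeeds independently 65% of the time.

```r
# Generic retry helper: call f(...) until it returns something other than NA
# or until max_tries attempts have been made.
with_retries <- function(f, ..., max_tries = 4, pause = 0) {
  for (attempt in seq_len(max_tries)) {
    result <- f(...)
    if (!all(is.na(result))) {
      return(list(value = result, attempts = attempt))
    }
    Sys.sleep(pause)  # wait between attempts to respect rate limits
  }
  list(value = NA, attempts = max_tries)
}

# Hypothetical stand-in for a geocoder: fails on the first two calls.
calls <- 0
flaky_geocoder <- function(address) {
  calls <<- calls + 1
  if (calls < 3) NA else c(-104.99, 39.74)  # made-up (longitude, latitude)
}

res <- with_retries(flaky_geocoder, "1550 HIAWATHA DRIVE, COLORADO SPRINGS, CO")
print(res$attempts)

# If each query independently succeeds 65% of the time (an assumption),
# the chance that all four tries fail is 0.35^4, about 1.5%.
p_fail_all <- (1 - 0.65)^4
print(p_fail_all)
```

If failures are caused by rate limiting rather than chance, a longer `pause` between attempts may matter more than the number of tries.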
require(geonames)
## Loading required package: geonames
## No geonamesUsername set. See http://geonames.wordpress.com/2010/03/16/ddos-part-ii/ and set one with options(geonamesUsername="foo") for some services to work
geocodeAdddress <- function(address) { #returns latitude and longitude using google maps
require(RJSONIO)
url <- "http://maps.google.com/maps/api/geocode/json?address="
url <- URLencode(paste(url, address, "&sensor=false", sep = ""))
x <- fromJSON(url, simplify = FALSE)
if (x$status == "OK") {
out <- c(x$results[[1]]$geometry$location$lng,
x$results[[1]]$geometry$location$lat)
} else {
out <- NA
}
Sys.sleep(0.2) # API only allows 5 requests per second
out
}
print(nrow(locations))
## [1] 656
z <- 1
while(z <= nrow(locations)) {
tries <- 1
found <- FALSE
while(!found && tries <= 4) { #try up to four times
address <- (strsplit(locations[z, "demographies....demographies"], "\nTelephone: "))[[1]][1]
latlong <- geocodeAdddress(address)
locations[z,"long"] <- latlong[1]
locations[z,"lat"] <- latlong[2]
if(!is.na(locations[z,"lat"]))
{
found <- TRUE
}
tries <- tries + 1
}
z <- z + 1
}
## Loading required package: RJSONIO
Now we will create the popup content that will be displayed on the map using a lot of string manipulation. When we click on icons on our map we will see the name of the location which will also contain a hyperlink to the location’s page in the database. We will display the address of the location, the ownership type, the number of beds, and the total citations since 2014 of the location. Finally we will show the number of A, B, C, D, and E citations since 2014.
locations$numericLAT <- as.numeric(locations$lat)
locations$numericLONG <- as.numeric(locations$long) # make sure they're numeric (lat from lat, long from long)
z <- 1
while(z <= nrow(locations)) { #create pop-up content
address <- (strsplit(locations[z, "demographies....demographies"], "\nTelephone: "))[[1]][1]
address <- (strsplit(address, "\n"))
content <- paste(
sprintf("<b><a href='%s'>%s</a><br></b>", locations[z,"tophrefs....hrefs"], locations[z,"names....names"]),
sprintf("%s<br>", address[[1]][1]),
sprintf("%s<br>", address[[1]][2]),
sprintf("%s<br>", address[[1]][3]),
sprintf("Ownership Type: %s<br>", locations[z,"ownership"]),
sprintf("Number of Beds: %s<br>", locations[z,"beds"]),
sprintf("Total Citations: %s<br>", locations[z,"allgrades"]),
sprintf("A:%s B:%s C:%s D:%s E:%s<br>",
locations[z,"A"], locations[z,"B"], locations[z,"C"], locations[z,"D"], locations[z,"E"])
)
locations[z, 'summary'] <- content
z <- z + 1
}
head(locations)
## names....names
## 1 A DOCTOR'S TOUCH LLC
## 2 A FEATHERED NEST AT THORNTON
## 3 A LEGACY PERSONAL CARE HOME
## 4 A LOVING HAND ASSISTED LIVING INC
## 5 A ROBIN'S NEST
## 6 A WILDFLOWER ASSISTED LIVING AND CARE HOME INC
## tophrefs....hrefs
## 1 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23G941&ft=pcbhpp
## 2 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=2304JA&ft=pcbhpp
## 3 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23R668&ft=pcbhpp
## 4 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=2304ZS&ft=pcbhpp
## 5 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23052H&ft=pcbhpp
## 6 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23J158&ft=pcbhpp
## demographies....demographies
## 1 A DOCTOR'S TOUCH LLC\n1550 HIAWATHA DRIVE\nCOLORADO SPRINGS, CO 80915 - EL PASO COUNTY\nTelephone: (719)638-8198, Fax: (719)694-3969\n\nAdministrator/Contact: Mr HERALD OSTOVAR\nAssisted Living Residence - Medicaid Certified\nLicensed Beds: 8\nOwnership type: LIMITED LIABILITY\nCurrent ownership effective: 1/14/2008\nOmbudsman Phone: (719)471-7080
## 2 A FEATHERED NEST AT THORNTON\n11540 MILWAUKEE ST\nTHORNTON, CO 80233 - ADAMS COUNTY\nTelephone: (303)453-0810, Fax: (303)453-0810\n\nAdministrator/Contact: MS ANNETTE FARRELL\nAssisted Living Residence - Medicaid Certified\nLicensed Beds: 7\nOwnership type: PROFIT-CORPORATION\nCurrent ownership effective: 10/2/1998\nOmbudsman Phone: (303)480-5624
## 3 A LEGACY PERSONAL CARE HOME\n4050 SOUTH FOX STREET\nENGLEWOOD, CO 80110 - ARAPAHOE COUNTY\nTelephone: (303)783-4989, Fax: (303)635-6719\n\nAdministrator/Contact: Mr SCOTT BOYLE\nAssisted Living Residence - Private Pay\nLicensed Beds: 8\nOwnership type: PROFIT-CORPORATION\nCurrent ownership effective: 7/21/2015\nOmbudsman Phone: (303)480-5624
## 4 A LOVING HAND ASSISTED LIVING INC\n3079 S HOLLY PLACE\nDENVER, CO 80222 - ARAPAHOE COUNTY\nTelephone: (303)782-6951, Fax: (303)504-6038\n\nAdministrator/Contact: Ms JANNELLE MOLINA\nAssisted Living Residence - Medicaid Certified\nLicensed Beds: 8\nOwnership type: PROFIT-CORPORATION\nCurrent ownership effective: 5/8/2007\nOmbudsman Phone: (303)480-5624
## 5 A ROBIN'S NEST\n3182 E OAK CREEK DR\nCOLORADO SPRINGS, CO 80906 - EL PASO COUNTY\nTelephone: (719)226-2941, Fax: (719)226-2942\n\nAdministrator/Contact: MS MARY LOUISE KOURI\nAssisted Living Residence - Private Pay\nLicensed Beds: 8\nOwnership type: PROFIT-CORPORATION\nCurrent ownership effective: 2/23/2000\nOmbudsman Phone: (719)471-7080
## 6 A WILDFLOWER ASSISTED LIVING AND CARE HOME INC\n8417 W 74TH PLACE\nARVADA, CO 80005 - JEFFERSON COUNTY\nTelephone: (303)456-7890, Fax: (866)941-5820\n\nAdministrator/Contact: Ms NICOLE SCHIAVONE\nAssisted Living Residence - Medicaid Certified\nLicensed Beds: 8\nOwnership type: PROFIT-CORPORATION\nCurrent ownership effective: 2/8/2018\nOmbudsman Phone: (303)480-5624\n\nMail c/o: A WILDFLOWER ASSISTED LIVING\n1140 US HWY 281, STE 400-287\nBROOMFIELD,CO 80020
## occurencesHrefs....occurrencesHrefs
## 1 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## 2 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## 3 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## 4 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## 5 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## 6 http://www.hfemsd2.dphe.state.co.us/hfd2003/dtlocc06.aspx?id=2304A9&ft=pcbhpp
## occurencesTexts....occurrencesTexts
## 1 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A DOCTOR'S TOUCH LLC\n\nAbout the Occurrence Reporting System\nNote to Consumers
## 2 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A FEATHERED NEST AT THORNTON\n\nAbout the Occurrence Reporting System\nNote to Consumers
## 3 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A LEGACY PERSONAL CARE HOME\n\nAbout the Occurrence Reporting System\nNote to Consumers
## 4 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A LOVING HAND ASSISTED LIVING INC\n\nAbout the Occurrence Reporting System\nNote to Consumers
## 5 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A ROBIN'S NEST\n\nAbout the Occurrence Reporting System\nNote to Consumers
## 6 OCCURRENCES:\n\nOccurrence Investigative Reports Released from 5/18/2015 to 5/18/2018 for A WILDFLOWER ASSISTED LIVING AND CARE HOME INC\n\nAbout the Occurrence Reporting System\nNote to Consumers
## healthSurveyCount....healthSurveyCount
## 1 5
## 2 3
## 3 3
## 4 3
## 5 4
## 6 2
## lifeSafetySurveyCount....lifeSafetySurveyCount
## 1 1
## 2 1
## 3 1
## 4 1
## 5 1
## 6 0
## accessTimes....accessTimes id ownership beds A B C D E
## 1 Friday, May 18, 2018 1:48 AM 23G941 LIMITED LIABILITY 8 5 12 0 0 0
## 2 Friday, May 18, 2018 1:48 AM 2304JA PROFIT-CORPORATION 7 2 4 0 0 0
## 3 Friday, May 18, 2018 1:48 AM 23R668 PROFIT-CORPORATION 8 0 5 0 0 0
## 4 Friday, May 18, 2018 1:48 AM 2304ZS PROFIT-CORPORATION 8 8 2 0 0 0
## 5 Friday, May 18, 2018 1:48 AM 23052H PROFIT-CORPORATION 8 0 1 0 0 0
## 6 Friday, May 18, 2018 1:48 AM 23J158 PROFIT-CORPORATION 8 0 4 0 0 0
## allgrades hasE long lat numericLAT numericLONG
## 1 17 no NA NA NA NA
## 2 6 no NA NA NA NA
## 3 5 no NA NA NA NA
## 4 10 no NA NA NA NA
## 5 1 no NA NA NA NA
## 6 4 no NA NA NA NA
## summary
## 1 <b><a href='http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23G941&ft=pcbhpp'>A DOCTOR'S TOUCH LLC</a><br></b> A DOCTOR'S TOUCH LLC<br> 1550 HIAWATHA DRIVE<br> COLORADO SPRINGS, CO 80915 - EL PASO COUNTY<br> Ownership Type: LIMITED LIABILITY<br> Number of Beds: 8<br> Total Citations: 17<br> A:5 B:12 C:0 D:0 E:0<br>
## 2 <b><a href='http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=2304JA&ft=pcbhpp'>A FEATHERED NEST AT THORNTON</a><br></b> A FEATHERED NEST AT THORNTON<br> 11540 MILWAUKEE ST<br> THORNTON, CO 80233 - ADAMS COUNTY<br> Ownership Type: PROFIT-CORPORATION<br> Number of Beds: 7<br> Total Citations: 6<br> A:2 B:4 C:0 D:0 E:0<br>
## 3 <b><a href='http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23R668&ft=pcbhpp'>A LEGACY PERSONAL CARE HOME</a><br></b> A LEGACY PERSONAL CARE HOME<br> 4050 SOUTH FOX STREET<br> ENGLEWOOD, CO 80110 - ARAPAHOE COUNTY<br> Ownership Type: PROFIT-CORPORATION<br> Number of Beds: 8<br> Total Citations: 5<br> A:0 B:5 C:0 D:0 E:0<br>
## 4 <b><a href='http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=2304ZS&ft=pcbhpp'>A LOVING HAND ASSISTED LIVING INC</a><br></b> A LOVING HAND ASSISTED LIVING INC<br> 3079 S HOLLY PLACE<br> DENVER, CO 80222 - ARAPAHOE COUNTY<br> Ownership Type: PROFIT-CORPORATION<br> Number of Beds: 8<br> Total Citations: 10<br> A:8 B:2 C:0 D:0 E:0<br>
## 5 <b><a href='http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23052H&ft=pcbhpp'>A ROBIN'S NEST</a><br></b> A ROBIN'S NEST<br> 3182 E OAK CREEK DR<br> COLORADO SPRINGS, CO 80906 - EL PASO COUNTY<br> Ownership Type: PROFIT-CORPORATION<br> Number of Beds: 8<br> Total Citations: 1<br> A:0 B:1 C:0 D:0 E:0<br>
## 6 <b><a href='http://www.hfemsd2.dphe.state.co.us/hfd2003/dtl.aspx?id=23J158&ft=pcbhpp'>A WILDFLOWER ASSISTED LIVING AND CARE HOME INC</a><br></b> A WILDFLOWER ASSISTED LIVING AND CARE HOME INC<br> 8417 W 74TH PLACE<br> ARVADA, CO 80005 - JEFFERSON COUNTY<br> Ownership Type: PROFIT-CORPORATION<br> Number of Beds: 8<br> Total Citations: 4<br> A:0 B:4 C:0 D:0 E:0<br>
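The popup strings shown above could also be built without an explicit loop, because `sprintf()` is vectorized over its arguments. The following is a minimal sketch on a made-up `toy` data frame with simplified, hypothetical column names and placeholder URLs. It omits the address lines for brevity.

```r
# Hypothetical toy data with simplified column names.
toy <- data.frame(
  name = c("HOME ONE", "HOME TWO"),
  href = c("http://example.com/1", "http://example.com/2"),
  ownership = c("NON-PROFIT", "OTHER"),
  beds = c(8, 40),
  allgrades = c(3, 0),
  stringsAsFactors = FALSE
)

# One sprintf() call produces the popup HTML for every row at once.
toy$summary <- sprintf(
  "<b><a href='%s'>%s</a></b><br>Ownership Type: %s<br>Number of Beds: %s<br>Total Citations: %s<br>",
  toy$href, toy$name, toy$ownership, toy$beds, toy$allgrades
)
print(toy$summary[1])
```

A single format string also avoids the stray spaces that `paste()` inserts between its pieces by default.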
Below we will create the icons for the map and then finally create the leaflet map. For this we will load the leaflet, leaflet.extras, and htmltools packages.
library(leaflet)
## Warning: package 'leaflet' was built under R version 3.4.4
library(leaflet.extras)
library(htmltools)
houseIcon <- makeIcon(
iconUrl = "https://cdn2.iconfinder.com/data/icons/pittogrammi/142/65-512.png",
iconWidth = 20, iconHeight = 20,
iconAnchorX = 0, iconAnchorY = 0
) #initialize icons
foundlocations <- filter(locations, !is.na(lat))
#filter out locations with no lat/long data
colorado_map <- leaflet(foundlocations) %>%
addTiles() %>%
setView(lat=39.0501, lng=-105.80501 , zoom=7) %>%
addMarkers(lng = ~numericLONG, lat = ~numericLAT, icon = houseIcon, popup = ~summary)
colorado_map
For several locations Google invariably returns incorrect coordinates that fall far outside of Colorado; their popups still show the correct address and information.
Thank you for reading this tutorial. I hope it provided some insight into how to scrape a large multi-level web database as well as how to analyze variation in quality among elderly care homes. I hope I was successful in demonstrating the statistical importance of ownership type, showing the distribution of citations, and giving a way to browse locations in the state of Colorado through the interactive map.
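One way to keep such misplaced markers off the map is to drop any point outside a rough bounding box for Colorado. The sketch below assumes approximate state bounds (latitude 37 to 41, longitude about -109.06 to -102.04) and uses a made-up `toy` data frame. `in_colorado` is a hypothetical helper, not part of leaflet.

```r
# Approximate bounding box for Colorado (an assumption; adjust as needed).
in_colorado <- function(lat, lng) {
  !is.na(lat) & !is.na(lng) &
    lat >= 37 & lat <= 41 &
    lng >= -109.06 & lng <= -102.04
}

# Hypothetical toy data: one plausible point, one misplaced, one missing.
toy <- data.frame(
  name = c("DENVER HOME", "MISPLACED HOME", "UNGEOCODED HOME"),
  lat = c(39.74, 51.28, NA),
  lng = c(-104.99, 1.08, NA),
  stringsAsFactors = FALSE
)

# Keep only rows whose coordinates fall inside the box.
kept <- toy[in_colorado(toy$lat, toy$lng), ]
print(kept$name)
```

Rows removed this way could be listed separately so their addresses can be checked by hand.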